Structured Support Vector Machines for Speech Recognition

Author

  • Shi-Xiong ZHANG
Abstract

Discriminative training criteria and discriminative models are two effective improvements for HMM-based speech recognition. This thesis proposes a structured support vector machine (SSVM) framework suitable for medium to large vocabulary continuous speech recognition. An important aspect of structured SVMs is the form of the features. Several previously proposed features in the field are summarized in this framework. Since some of these features can be extracted based on generative models, this provides an elegant way of combining generative and discriminative models. To apply structured SVMs to continuous speech recognition, a number of issues need to be addressed. First, the features require a segmentation to be specified. To incorporate the optimal segmentation into the training process, the training algorithm is modified to make use of the concave-convex optimisation procedure. A Viterbi-style algorithm is described for inferring the optimal segmentation based on the discriminative parameters. Second, structured SVMs can be viewed as large-margin log-linear models using a zero-mean Gaussian prior on the discriminative parameters. However, this form of prior is not appropriate for all features. An extended training algorithm is proposed that allows general Gaussian priors to be incorporated into the large-margin criterion. Third, to speed up the training process, strategies of parameter tying, 1-slack optimisation, caching competing hypotheses, lattice-constrained search and parallelization are also described. Finally, to avoid explicit computation in the high-dimensional feature space and to achieve nonlinear decision boundaries, kernel-based training and decoding algorithms are also proposed. The performance of structured SVMs is evaluated on small and medium-to-large vocabulary speech recognition tasks: AURORA 2 and 4.
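
To make the link between the large-margin criterion and a Gaussian prior concrete, the following is a minimal sketch and not the thesis's implementation. It assumes a common textbook form of the structured SVM objective: a quadratic penalty on the parameters plus a margin-rescaled hinge loss over competing hypotheses. The function name `large_margin_objective`, the argument layout, and the `C / n` scaling are hypothetical choices made for illustration. Setting `mu` to zero and `precision` to the identity gives the usual zero-mean case; other values correspond to a general Gaussian prior.

```python
import numpy as np

def large_margin_objective(alpha, mu, precision, utterances, C=1.0):
    """Illustrative large-margin objective with a general Gaussian prior.

    alpha      : discriminative parameter vector
    mu         : prior mean (assumed; zero for the standard structured SVM)
    precision  : prior precision matrix (assumed; identity for the standard case)
    utterances : list of (phi_ref, competitors), where competitors is a list of
                 (phi_hyp, loss) pairs for competing hypotheses (hypothetical layout)
    """
    # Negative log of a Gaussian prior N(mu, precision^-1), up to a constant.
    diff = alpha - mu
    regulariser = 0.5 * diff @ precision @ diff

    hinge_total = 0.0
    for phi_ref, competitors in utterances:
        ref_score = alpha @ phi_ref
        # Most violating competitor: loss-augmented score (margin rescaling).
        worst = max(loss + alpha @ phi_hyp for phi_hyp, loss in competitors)
        hinge_total += max(0.0, worst - ref_score)

    return regulariser + C / len(utterances) * hinge_total


# Toy usage: one utterance, one competing hypothesis with loss 1.
toy = [(np.array([1.0, 0.0]), [(np.array([0.0, 1.0]), 1.0)])]
at_prior_mean = large_margin_objective(
    np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.eye(2), toy)
at_zero = large_margin_objective(
    np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.eye(2), toy)
```

In the toy example, parameters equal to the prior mean incur no penalty and already separate the reference from the competitor by the required margin, while all-zero parameters pay both the prior penalty and the full hinge loss.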

Similar resources

Face Recognition using Eigenfaces, PCA and Support Vector Machines

This paper is based on a combination of principal component analysis (PCA), eigenfaces and support vector machines. Using the N-fold method and with respect to the value of N, each person’s face images are divided into two sections. As a result, vectors of training features and test features are obtained. Classification precision and accuracy were examined with three different types of kernel and...

Structured Support Vector Machines for Noise Robust Continuous Speech Recognition

The use of discriminative models is an interesting alternative to generative models for speech recognition. This paper examines one form of these models, structured support vector machines (SVMs), for noise robust speech recognition. One important aspect of structured SVMs is the form of the joint feature space. In this work features based on generative models are used, which allows model-based...

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

Visual Speech Recognition Using Support Vector Machines

In this paper we propose a visual speech recognition network based on Support Vector Machines. Each word of the dictionary is described as a temporal sequence of visemes. Each viseme is described by a support vector machine, and the temporal character of speech is modeled by integrating the support vector machines as nodes into a Viterbi decoding lattice. Experiments conducted on a small visual...

Application of support vector machines classifiers to visual speech recognition

In this paper we proposed a visual speech recognition network based on Support Vector Machines. Each word of the dictionary is modeled by a set of temporal sequences of visemes. Each viseme is described by a support vector machine, and the temporal character of speech is modeled by integrating the support vector machines as nodes into Viterbi decoding lattices. Experiments conducted on a small ...

Publication date: 2014